Abstract

Regression lines for the prediction of WRAT standard scores by
Stanford-Binet IQs were compared across race, by the Potthoff
procedure, for equated groups (sex, age, and IQ) of 60 Black and
60 White children referred for psychological services by their
classroom teachers. Regression lines for Blacks and Whites did
not differ significantly for the prediction of WRAT scores by
the Stanford-Binet. Implications of these findings are discussed.

The use of psychological tests normed primarily with white
children for psychological diagnosis and educational decision-
making concerning minority children has become a very hotly
contested issue in recent years. Although much discussion of the
issue has appeared in both the scientific and public literature,
few data of relevance to the issue with school age children have
been presented. The use of such tests is of special concern
to psychologists involved in assessment, particularly in view of
the Larry P. case (Note 1) and P.L. 94-142 (Note 2; Education
for All Handicapped Children Act of 1975). Harrington (1976,
1978) has gone so far as to state that it is not possible for
tests developed and normed on a white majority to be other than
biased against minorities and to have less predictive validity
when used with minorities.

In response to pressure from the Black Psychological Association
(which was requesting a moratorium on the use of psychological
and educational tests with disadvantaged students), the APA
Board of Directors requested the Board of Scientific Affairs to
appoint a group to study the use of such tests with disadvantaged
students. In reporting on this issue, the committee (Cleary,
Humphreys, Kendrick, & Wesman, 1975) offered a definition of test
bias. While including content and construct validity as important
variables in the issue of test bias, the focus was clearly on
predictive validity:

    A test is considered fair for a particular use
    if the inference drawn from the test score is made
    with the smallest feasible random error and if there
    is no constant error in the inference as a function
    of membership in a particular group. (Cleary et al.,
    1975, p. 25)


The definition of bias offered by the APA committee is a
restatement of previous definitions by Cardall and Coffman (1964),
Cleary (1968), and Potthoff (1966), and has been widely accepted
(though certainly not without criticism, e.g., Bernal, 1975;
Linn & Werts, 1971; Thorndike, 1971). Oakland and Matuszek (1977)
examined class placement procedures under several models
of bias and demonstrated that the Cleary model results in the
smallest number of children being misplaced, although under certain
legislative conditions, Oakland and Matuszek favored the Thorndike
(1971) "quota" selection model. After reviewing a number of
models, Peterson and Novick (1976) designated the regression model
as the most logically tenable and the most widely used placement
guide. A statistical technique provided by Potthoff (1966) has
also received widespread acceptance in the examination of
regression lines to test for bias under the Cleary et al.
definition (Schmidt & Hunter, 1974).

While considerable data are available on the validity of
the Scholastic Achievement Test (e.g., Goldman & Hewitt, 1976;
Kallingal, 1971; Pfiefer & Sedlacek, 1971) and various employment
tests (e.g., Boehm, 1972; Hunter, Schmidt, & Hunter, 1979) for
blacks and whites, only recently have studies appeared dealing
with differential validity of IQ tests. Mitchell (1967) studied
the validity of two broad based readiness tests to predict first
grade achievement for blacks and whites, finding similar validity
coefficients for the two races. Mitchell's study was limited to
comparing the magnitude of independent-dependent variable
correlation and did not look for identity of regression lines.
Hartlage, Lucas, and Godwin (1976) compared the predictive validity
of the WISC and Raven with a group of low SES, disadvantaged
children. When comparing what they considered to be the relatively
culture-fair test, the Raven Matrices, with the "culture-loaded"
1949 WISC, Hartlage et al. (1976) found the WISC to have
consistently larger correlations with measures of reading, spelling,
and arithmetic than the Raven. These authors only compared the
strength of the relationship in each case and did not look for
identity of regression lines (equivalent beta coefficients and
intercept constants).

More recently, Reynolds and Hartlage (1979) compared regression
lines for the prediction of achievement by the WISC and the WISC-R
across race for blacks and whites. Their results indicated that
regression lines for blacks and whites did not differ significantly.
Reynolds and Gutkin (1980) replicated the Reynolds and Hartlage
(1979) study for the WISC-R, comparing regression lines between
whites and Mexican-Americans. Again, no significant differences
were found. In a study with much larger samples, Reschly and
Sabers (1979) investigated the WISC-R's ability to predict
Metropolitan Achievement Test scores across four ethnic groups
(blacks, whites, Chicanos, and Native American Papagos). Reschly
and Sabers (1979) adopted the Cleary regression definition and
a procedure by Gulliksen and Wilks (1950) that separately tests
slopes and intercepts (whereas the Potthoff, 1966, technique
simultaneously tests slopes and intercepts). They found that the
WISC-R was for the most part equally valid for the different
groups. When differences occurred, they were due to variations
in intercepts resulting in the over-prediction of performance for
non-white groups.
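
The idea of testing slopes and intercepts separately can be
illustrated with nested least-squares models. The sketch below is
not the Gulliksen and Wilks (1950) procedure itself; it substitutes
ordinary nested-model F tests (equal slopes first, then equal
intercepts given a common slope). The function names and the
calling convention are hypothetical and chosen only for
illustration.

```python
def sums(x, y):
    """Return n, Sxx, Sxy, Syy (sums of squares about the sample means)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    return n, sxx, sxy, syy


def slope_and_intercept_tests(x1, y1, x2, y2):
    """Nested-model F tests for equal slopes, then equal intercepts.

    Returns (f_slope, df_slope, f_intercept, df_intercept); each df is a
    (numerator, denominator) pair.
    """
    n1, sxx1, sxy1, syy1 = sums(x1, y1)
    n2, sxx2, sxy2, syy2 = sums(x2, y2)
    n = n1 + n2

    # Model A: a separate regression line for each group.
    sse_a = (syy1 - sxy1 ** 2 / sxx1) + (syy2 - sxy2 ** 2 / sxx2)

    # Model B: one pooled within-group slope, separate intercepts.
    sse_b = (syy1 + syy2) - (sxy1 + sxy2) ** 2 / (sxx1 + sxx2)

    # Model C: a single common line fitted to the combined sample.
    _, sxx_t, sxy_t, syy_t = sums(list(x1) + list(x2), list(y1) + list(y2))
    sse_c = syy_t - sxy_t ** 2 / sxx_t

    # Slopes: does allowing different slopes reduce error beyond Model B?
    f_slope = (sse_b - sse_a) / (sse_a / (n - 4))
    # Intercepts: does allowing different intercepts improve on Model C?
    f_intercept = (sse_c - sse_b) / (sse_b / (n - 3))
    return f_slope, (1, n - 4), f_intercept, (1, n - 3)
```

A significant intercept test with a nonsignificant slope test
corresponds to the pattern Reschly and Sabers reported: prediction
errors that are constant across the score range for a given group.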

The purpose of the present study is to provide data that will
aid in the empirical evaluation of test bias (under the Cleary
et al., 1975, definition) for the Stanford-Binet Intelligence
Scale, Form L-M, 1972 Norms Edition (Terman & Merrill, 1973). It
was hypothesized that, as with previous research on the WISC and
WISC-R, no significant differences would occur between regression
lines across groups. Previous research on bias has ignored the
Binet. The Binet should be of particular interest in test bias
research since it has historically been the IQ test against which
new tests have been validated.

METHOD

Subjects

The sample consisted of equated groups of 60 white and 60
black urban children referred by teachers for psychological
evaluation due to a variety of learning and/or behavior problems. A
referral population was chosen because they are the predominant
group of interest in the prediction of achievement from the IQ.
The children were chosen as follows from more than 1,000 district-
wide referrals. A computer listing of all children with complete
data was obtained. Every third black male was chosen until 30
children were obtained. The procedure was repeated for black females.
Since random assignment to race or sex is not possible, whites were
chosen to match the black children on the variables of age (within
6 months), sex, and IQ (within 10 points). To match the groups, a
black child was chosen and records of the white group examined.
The first matching white child to be encountered was selected. The
resulting sample characteristics are described in greater detail in
Table 1. The relatively low IQ of the groups is typical of referral
populations (Gutkin & Reynolds, in press; Reynolds, Gutkin, Dappen,
& Wright, 1979; Reynolds & Hartlage, 1979).

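
The selection and matching steps described above can be sketched
as follows. This is an illustration under stated assumptions, not
the procedure actually run for the study: the record fields
(`sex`, `age_months`, `iq`) are hypothetical, "every third" is
assumed to mean the 3rd, 6th, 9th, ... entry of the listing, and
each white child is assumed to be used for at most one match.

```python
def every_kth(listing, k=3, limit=30):
    """Systematic sample: take every k-th record until `limit` are chosen.

    Assumes counting starts at the first record, so the k-th, 2k-th, ...
    entries are selected.
    """
    return listing[k - 1::k][:limit]


def is_match(target, candidate, max_age_gap=6, max_iq_gap=10):
    """Same sex, age within 6 months, IQ within 10 points."""
    return (
        candidate["sex"] == target["sex"]
        and abs(candidate["age_months"] - target["age_months"]) <= max_age_gap
        and abs(candidate["iq"] - target["iq"]) <= max_iq_gap
    )


def match_groups(targets, pool):
    """Pair each target child with the first unused matching pool record."""
    used = set()  # indices of pool records already paired (an assumption)
    pairs = []
    for target in targets:
        for idx, candidate in enumerate(pool):
            if idx not in used and is_match(target, candidate):
                used.add(idx)
                pairs.append((target, candidate))
                break  # first match encountered is selected
    return pairs
```

Because the first acceptable record is taken rather than the
closest one, the order of the listing influences which children
are paired.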
Procedure

The Stanford-Binet Intelligence Scale (Terman & Merrill, 1973)
and the most recent revision of the Wide Range Achievement Test
(Jastak & Jastak, 1978) were administered by certified school
psychologists and psychological assistants. Testing on both scales
was accomplished during a single session.

Regression lines for each pair of scores (Binet IQ predicting
each WRAT subtest) were examined across race through the Potthoff
(1966) technique. This procedure yields a single F ratio that
simultaneously tests regression coefficients (slopes) and intercept
values. If a significant F results, slopes and intercepts may then
be assessed separately to determine whether the resulting bias in
prediction is constant (intercepts differ) or changes with the
distance of scores from the mean (slopes differ). Slopes and
intercepts must both be equivalent prior to concluding homogeneity of
regression across groups. Only when slope and intercepts are the
same can a common regression equation (derived by combining the
groups in question) be applied. If homogeneity of regression across
groups does not occur, then in order to have fair use of test scores,
separate equations for each group must be employed.
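
A simultaneous test of slopes and intercepts of this kind can be
written as a comparison of two fits: one common regression line
for the combined sample against a separate line for each group.
The following is a minimal sketch of that comparison as it is
commonly described, not a reproduction of the analysis program
used here; the function names are hypothetical. With two groups of
60, the degrees of freedom are 2 and 116, as in the F ratios
reported for this study.

```python
def line_sse(x, y):
    """Residual sum of squares for the least-squares line of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    return syy - sxy ** 2 / sxx


def simultaneous_f(x1, y1, x2, y2):
    """F ratio testing slopes and intercepts together across two groups.

    Returns (F, df_numerator, df_denominator).
    """
    # Full model: each group has its own slope and intercept.
    sse_separate = line_sse(x1, y1) + line_sse(x2, y2)
    # Reduced model: one line for the combined sample.
    sse_common = line_sse(list(x1) + list(x2), list(y1) + list(y2))

    df_num = 2                        # one slope + one intercept constrained
    df_den = len(x1) + len(x2) - 4    # four parameters in the full model
    f_ratio = ((sse_common - sse_separate) / df_num) / (sse_separate / df_den)
    return f_ratio, df_num, df_den
```

A nonsignificant F supports using the common equation for both
groups; a significant F calls for the follow-up tests of slopes
and intercepts described above.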

RESULTS

Regression lines for blacks and whites did not differ at the
.05 level of significance for the prediction of WRAT Reading,
F(2, 116) = 1.24, p > .05, Spelling, F(2, 116) = 0.18, p > .05, or
Arithmetic, F(2, 116) = 2.24, p > .05, standard scores by the
Stanford-Binet IQ. The present results provide support for the
use of a common regression equation (Bossard & Galusha, in press)
to predict WRAT achievement scores for referred black and white
children with the Stanford-Binet. Correlations between the
Stanford-Binet IQ and achievement for both groups were
substantial, never accounting for less than 49% of the variance in
achievement scores. For black children the correlations were:
.74 with Reading, .78 with Spelling, and .70 with Arithmetic. For
whites the correlations were: .81 with Reading, .81 with Spelling,
and .82 with Arithmetic. As expected from the results of the
Potthoff analysis, the pairs of correlations are quite similar across
these two racial groupings.
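
The variance figures follow directly from squaring the reported
correlations. The sketch below reproduces that arithmetic and, as
an illustrative check that was not part of the original analysis,
compares each pair of correlations with Fisher's r-to-z
transformation, assuming 60 children per group. The function names
are hypothetical.

```python
import math

# Correlations between Stanford-Binet IQ and WRAT scores, as reported.
CORRELATIONS = {
    "Reading": {"black": 0.74, "white": 0.81},
    "Spelling": {"black": 0.78, "white": 0.81},
    "Arithmetic": {"black": 0.70, "white": 0.82},
}


def variance_accounted(r):
    """Proportion of criterion variance shared with the predictor."""
    return r * r


def fisher_z_difference(r1, r2, n1, n2):
    """z statistic for the difference between two independent correlations."""
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (math.atanh(r1) - math.atanh(r2)) / standard_error


if __name__ == "__main__":
    for subtest, r in CORRELATIONS.items():
        z = fisher_z_difference(r["black"], r["white"], 60, 60)
        print(
            subtest,
            round(variance_accounted(r["black"]), 2),
            round(variance_accounted(r["white"]), 2),
            round(z, 2),
        )
```

The smallest squared correlation is .70 squared, or .49, which is
the source of the 49% lower bound stated above.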

DISCUSSION

The study's results are consistent with other investigations
of test bias using the regression definition. That is, standardized
intelligence tests have been shown to predict school achievement
about equally well for blacks and whites. Prior to concluding that
the Stanford-Binet Intelligence Scale is free of bias in terms of
predictive accuracy (the regression definition), more research is
needed utilizing a wide variety of criterion measures including
other individual achievement tests, group achievement tests, and
teacher constructed scales. Studies of this kind will help to
evaluate the relative influence of bias within different criterion
measures. Since using a referral population may minimize differences
between groups, replication with normal children will also need to
be undertaken.

Test developers need to become more aware of the issue of
bias, to the point of demonstrating validity across groups prior
to publication of the instrument. While this has occurred somewhat
in the area of achievement testing (Anastasi, 1976), investigations
of differential validity by test publishers are conspicuously lacking.
Studies similar to the present investigation are needed with other
existing measurement instruments to determine whether alterations
in interpretation of the scales are needed when applied to groups
other than the majority population.

At present however, a considerable body of data is accumulating
indicating consistency of content (Jensen & Figueroa, 1975),
construct (Gutkin & Reynolds, in press; Jensen, 1976; Reschly, 1978;
Reynolds, in press a, b), and predictive (Reschly & Sabers, 1979;
Reynolds & Gutkin, 1980; Reynolds & Hartlage, 1979) validity
of the IQ test across racial groupings.

Table 1

Sample Characteristics by Race and Sex

                                Stanford-Binet          Wide Range Achievement Test
                 Age in Years        IQ            Reading       Spelling      Arithmetic
Race    Sex   N   Mean    SD     Mean     SD     Mean     SD    Mean     SD    Mean     SD
Blacks  M    30   8.38    --    82.82     --    83.43  16.11   83.83     --   84.07  16.42
        F    30   8.53  2.70    83.33  20.79    83.30  16.47   84.20     --      --  17.95
Whites  M    30   8.30  2.79    84.53  16.68    82.97  16.63   84.77     --   80.83  19.07
        F    30   8.42  2.88    84.16  23.99       --  23.48   85.77     --   83.47  23.22

Note. -- = value not legible in the source. The Arithmetic mean for
black females is only partly legible (ends in 2.50). Two further
legible values, 84.90 and a Spelling SD of 18.16, could not be
assigned to a row.


